 target ship


2-Level Reinforcement Learning for Ships on Inland Waterways

Martin Waltz, Niklas Paulig, Ostap Okhrin

arXiv.org Artificial Intelligence

This paper proposes a realistic modularized framework for controlling autonomous surface vehicles (ASVs) on inland waterways (IWs) based on deep reinforcement learning (DRL). The framework comprises two levels: a high-level local path planning (LPP) unit and a low-level path following (PF) unit, each consisting of a DRL agent. The LPP agent plans a path that accounts for nearby vessels, traffic rules, and the geometry of the waterway. To this end, we transfer a recently proposed spatial-temporal recurrent neural network architecture to continuous action spaces. The LPP agent improves operational safety in comparison to a state-of-the-art artificial potential field method by increasing the minimum distance to other vessels by 65% on average. The PF agent performs low-level actuator control while accounting for shallow water influences and the environmental forces of wind, waves, and currents. Compared with a proportional-integral-derivative (PID) controller, the PF agent yields only 61% of the mean cross-track error while significantly reducing control effort in terms of the required absolute rudder angle. Lastly, both agents are jointly validated in simulation, employing the lower Elbe in northern Germany as an example case and using real automatic identification system (AIS) trajectories to model the behavior of other ships.
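The cross-track error mentioned in the abstract can be illustrated with a minimal sketch. It assumes a straight path segment between two waypoints in a flat, right-handed x-right, y-up frame; the function names `cross_track_error` and `mean_abs_cross_track_error` are hypothetical and not taken from the paper, which may define the error along curved paths or in a different frame.

```python
import math


def cross_track_error(pos, wp_a, wp_b):
    """Signed perpendicular distance from pos to the line through wp_a -> wp_b.

    Assumption: flat 2-D coordinates (x right, y up). The result is positive
    when pos lies to the left of the direction of travel from wp_a to wp_b.
    """
    ax, ay = wp_a
    bx, by = wp_b
    px, py = pos
    seg_len = math.hypot(bx - ax, by - ay)
    if seg_len == 0.0:
        raise ValueError("waypoints must be distinct")
    # 2-D cross product of the segment direction and the offset to pos,
    # normalised by the segment length.
    return ((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / seg_len


def mean_abs_cross_track_error(track, wp_a, wp_b):
    """Average absolute cross-track error over a sequence of positions."""
    if not track:
        raise ValueError("track must contain at least one position")
    return sum(abs(cross_track_error(p, wp_a, wp_b)) for p in track) / len(track)
```

Under this definition, a statement such as "61% of the mean cross-track error" compares two such averages, one per controller, evaluated on the same path.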


Spatial-temporal recurrent reinforcement learning for autonomous ships

Martin Waltz, Ostap Okhrin

arXiv.org Artificial Intelligence

This paper proposes a spatial-temporal recurrent neural network architecture for deep Q-networks that can be used to steer an autonomous ship. The network design makes it possible to handle an arbitrary number of surrounding target ships while offering robustness to partial observability. Furthermore, a state-of-the-art collision risk metric is proposed to enable an easier assessment of different situations by the agent. The COLREG rules of maritime traffic are explicitly considered in the design of the reward function. The final policy is validated on a custom set of newly created single-ship encounters called 'Around the Clock' problems and the commonly used Imazu (1987) problems, which include 18 multi-ship scenarios. Performance comparisons with artificial potential field and velocity obstacle methods demonstrate the potential of the proposed approach for maritime path planning. Furthermore, the new architecture exhibits robustness when it is deployed in multi-agent scenarios and it is compatible with other deep reinforcement learning algorithms, including actor-critic frameworks.
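The abstract does not spell out its collision risk metric. As background only, the sketch below computes the classical distance and time to the closest point of approach (DCPA and TCPA), two quantities that collision risk measures are commonly built on. It assumes constant velocities in a flat 2-D frame; the function name `cpa` is hypothetical, and this is not the metric proposed in the paper.

```python
import math


def cpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (dcpa, tcpa) for two vessels moving with constant velocity.

    Assumption: positions in metres and velocities in metres per second,
    both as (x, y) tuples in the same flat frame. A negative tcpa means
    the closest point of approach already lies in the past.
    """
    # Position and velocity of the target relative to the own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]
    rel_speed_sq = vx * vx + vy * vy
    if rel_speed_sq == 0.0:
        # No relative motion: the separation never changes.
        return math.hypot(rx, ry), 0.0
    # Time at which the relative distance is minimal.
    tcpa = -(rx * vx + ry * vy) / rel_speed_sq
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa


# Any number of target ships can be handled by evaluating each in turn.
targets = [((100.0, 1000.0), (0.0, -5.0)), ((-400.0, 300.0), (2.0, 0.0))]
risks = [cpa((0.0, 0.0), (0.0, 5.0), pos, vel) for pos, vel in targets]
```

A pairwise formulation like this yields one value pair per target ship, which is one reason a network that accepts a variable number of targets is useful.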


COLREG-Compliant Collision Avoidance for Unmanned Surface Vehicle using Deep Reinforcement Learning

Eivind Meyer, Amalie Heiberg, Adil Rasheed, Omer San

arXiv.org Artificial Intelligence

Path Following and Collision Avoidance, be it for unmanned surface vessels or other autonomous vehicles, are two fundamental guidance problems in robotics. For many decades, they have been subject to academic study, leading to a vast number of proposed approaches. However, they have mostly been treated as separate problems, and have typically relied on non-linear first-principles models with parameters that can only be determined experimentally. The rise of Deep Reinforcement Learning (DRL) in recent years suggests an alternative approach: end-to-end learning of the optimal guidance policy from scratch by trial and error. In this article, we explore the potential of Proximal Policy Optimization (PPO), a DRL algorithm with demonstrated state-of-the-art performance on Continuous Control tasks, when applied to the dual-objective problem of controlling an underactuated Autonomous Surface Vehicle in a COLREGs-compliant manner such that it follows an a priori known desired path while avoiding collisions with other vessels along the way. Based on high-fidelity elevation and AIS tracking data from the Trondheim Fjord, an inlet of the Norwegian Sea, we evaluate the trained agent's performance in challenging, dynamic real-world scenarios where the ultimate success of the agent rests upon its ability to navigate non-uniform marine terrain while handling challenging but realistic vessel encounters.
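For readers unfamiliar with PPO, its standard clipped surrogate objective is reproduced below as background. The abstract does not state which PPO variant or hyperparameters the authors use, so the clipping range $\epsilon$ and the advantage estimator $\hat{A}_t$ are left generic.

```latex
% Standard PPO clipped surrogate objective (general form, not specific to this article).
% r_t(\theta): probability ratio between the new and the old policy.
% \hat{A}_t: advantage estimate at time step t; \epsilon: clipping range.
\[
  r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
  \qquad
  L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[
    \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
    \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big)
  \right]
\]
```

Clipping removes the incentive to move the policy ratio outside the interval $[1-\epsilon, 1+\epsilon]$, which keeps each policy update close to the policy that collected the data.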


US cyberattack brought down Iranian database used to target ships in Persian Gulf: reports

FOX News

Iran is still feeling the pain after U.S. cyber military forces brought down a database used by its Revolutionary Guard Corps to target ships in the Persian Gulf, hours after the Islamic Republic shot down an American drone, officials say. The retaliatory cyberattack on June 20 focused on a system that Iran uses to determine which oil tankers and marine traffic it should go after, a senior U.S. official told the New York Times. As of Thursday, Iran has yet to recover all of the data lost in the attack and is trying to restore military communication networks linked to the database, the newspaper added. President Trump reportedly signed off on the U.S. Cyber Command's strike, though the government has not publicly acknowledged it happened, according to the Washington Post.